Overview

Dataset statistics

Number of variables: 40
Number of observations: 347469
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 106.0 MiB
Average record size in memory: 320.0 B
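
The two memory figures are consistent with each other: the total size is approximately the number of observations times the average record size. A minimal Python check using only the numbers reported above; the variable names are illustrative, and reading 8 bytes per cell as 64-bit values is an assumption, not something the report states.

```python
# Figures copied from the dataset statistics above.
n_variables = 40
n_observations = 347469
avg_record_bytes = 320.0  # "Average record size in memory"

# Total size implied by the per-record average, converted to MiB (2**20 bytes).
total_mib = n_observations * avg_record_bytes / 2**20

# Average bytes per cell; 8 bytes would be consistent with 64-bit values
# or pointers (an interpretation, not stated by the report).
bytes_per_cell = avg_record_bytes / n_variables

print(f"{total_mib:.1f} MiB, {bytes_per_cell:.1f} B per cell")
```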

Variable types

Numeric: 10
Categorical: 30

Alerts

count_floors_pre_eq is highly correlated with height_percentage (High correlation)
height_percentage is highly correlated with count_floors_pre_eq (High correlation)
has_secondary_use is highly correlated with has_secondary_use_agriculture and 1 other fields (High correlation)
has_secondary_use_agriculture is highly correlated with has_secondary_use (High correlation)
has_secondary_use_hotel is highly correlated with has_secondary_use (High correlation)
has_superstructure_cement_mortar_brick is highly correlated with ground_floor_type and 1 other fields (High correlation)
ground_floor_type is highly correlated with has_superstructure_cement_mortar_brick (High correlation)
has_superstructure_rc_non_engineered is highly correlated with foundation_type (High correlation)
has_secondary_use_agriculture is highly correlated with has_secondary_use (High correlation)
roof_type is highly correlated with foundation_type and 1 other fields (High correlation)
has_secondary_use_hotel is highly correlated with has_secondary_use (High correlation)
has_superstructure_mud_mortar_stone is highly correlated with foundation_type (High correlation)
has_secondary_use is highly correlated with has_secondary_use_agriculture and 1 other fields (High correlation)
foundation_type is highly correlated with has_superstructure_cement_mortar_brick and 4 other fields (High correlation)
has_superstructure_rc_engineered is highly correlated with foundation_type (High correlation)
other_floor_type is highly correlated with roof_type (High correlation)
geo_level_1_id is highly correlated with foundation_type (High correlation)
count_floors_pre_eq is highly correlated with height_percentage and 1 other fields (High correlation)
height_percentage is highly correlated with count_floors_pre_eq and 1 other fields (High correlation)
foundation_type is highly correlated with geo_level_1_id and 2 other fields (High correlation)
roof_type is highly correlated with foundation_type and 2 other fields (High correlation)
ground_floor_type is highly correlated with foundation_type and 1 other fields (High correlation)
other_floor_type is highly correlated with count_floors_pre_eq and 6 other fields (High correlation)
position is highly correlated with has_superstructure_mud_mortar_brick (High correlation)
has_superstructure_mud_mortar_stone is highly correlated with other_floor_type and 2 other fields (High correlation)
has_superstructure_mud_mortar_brick is highly correlated with position and 1 other fields (High correlation)
has_superstructure_cement_mortar_brick is highly correlated with other_floor_type and 1 other fields (High correlation)
has_superstructure_timber is highly correlated with has_superstructure_bamboo (High correlation)
has_superstructure_bamboo is highly correlated with has_superstructure_timber (High correlation)
has_superstructure_rc_non_engineered is highly correlated with other_floor_type (High correlation)
has_superstructure_rc_engineered is highly correlated with other_floor_type (High correlation)
building_id has unique values (Unique)
geo_level_1_id has 5358 (1.5%) zeros (Zeros)
age has 34725 (10.0%) zeros (Zeros)
count_families has 27937 (8.0%) zeros (Zeros)
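
A "High correlation" alert is raised when a correlation measure between two variables exceeds a configured threshold. The sketch below illustrates the idea for Pearson's coefficient in plain Python. The 0.9 threshold, the `high_correlation_pairs` helper, and the toy columns are assumptions made for illustration; they are not taken from this report or from pandas-profiling's internals, and the report's alerts may also come from other measures that handle categorical variables.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def high_correlation_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (a, xs), (b, ys) in combinations(columns.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged

# Toy data (hypothetical): height grows with floor count, age does not.
toy = {
    "count_floors_pre_eq": [1, 2, 2, 3, 4, 5],
    "height_percentage": [2, 5, 4, 7, 9, 12],
    "age": [10, 0, 25, 5, 40, 15],
}
pairs = high_correlation_pairs(toy)
for a, b, r in pairs:
    print(f"{a} is highly correlated with {b} (r = {r:.2f})")
```

With the toy data only the floor-count and height pair crosses the threshold, which mirrors the first alert in the list above.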

Reproduction

Analysis started: 2022-04-28 14:33:31.438641
Analysis finished: 2022-04-28 14:36:12.222184
Duration: 2 minutes and 40.78 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
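
The reported duration follows directly from the two timestamps. A small Python check using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-04-28 14:33:31.438641", fmt)
finished = datetime.strptime("2022-04-28 14:36:12.222184", fmt)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```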

Variables

df_index
Real number (ℝ≥0)

Distinct: 260601
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 108583.1875
Minimum: 0
Maximum: 260600
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 8686.4
Q1: 43433
median: 86867
Q3: 173733
95-th percentile: 243226.6
Maximum: 260600
Range: 260600
Interquartile range (IQR): 130300

Descriptive statistics

Standard deviation: 76266.72844
Coefficient of variation (CV): 0.7023806374
Kurtosis: -1.105325047
Mean: 108583.1875
Median Absolute Deviation (MAD): 57911
Skewness: 0.4155819232
Sum: 3.772929158 × 10^10
Variance: 5816613867
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                  Count   Frequency (%)
0                      2       < 0.1%
57911                  2       < 0.1%
57919                  2       < 0.1%
57918                  2       < 0.1%
57917                  2       < 0.1%
57916                  2       < 0.1%
57915                  2       < 0.1%
57914                  2       < 0.1%
57913                  2       < 0.1%
57912                  2       < 0.1%
Other values (260591)  347449  > 99.9%

Value  Count  Frequency (%)
0      2      < 0.1%
1      2      < 0.1%
2      2      < 0.1%
3      2      < 0.1%
4      2      < 0.1%
5      2      < 0.1%
6      2      < 0.1%
7      2      < 0.1%
8      2      < 0.1%
9      2      < 0.1%

Value   Count  Frequency (%)
260600  1      < 0.1%
260599  1      < 0.1%
260598  1      < 0.1%
260597  1      < 0.1%
260596  1      < 0.1%
260595  1      < 0.1%
260594  1      < 0.1%
260593  1      < 0.1%
260592  1      < 0.1%
260591  1      < 0.1%
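
Several of the df_index figures above are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR and range are differences of quantiles. A short Python check with the reported values; the variable names are illustrative.

```python
# Summary figures copied from the df_index statistics above.
mean = 108583.1875
std = 76266.72844
q1, q3 = 43433, 173733
minimum, maximum = 0, 260600

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV={cv:.4f} variance={variance:.0f} IQR={iqr} range={value_range}")
```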

building_id
Real number (ℝ≥0)

UNIQUE

Distinct: 347469
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525913.5838
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 4
5-th percentile: 52200.8
Q1: 261999
median: 526071
Q3: 789588
95-th percentile: 1000694
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 527589

Descriptive statistics

Standard deviation: 304354.4791
Coefficient of variation (CV): 0.5787157595
Kurtosis: -1.201737909
Mean: 525913.5838
Median Absolute Deviation (MAD): 263777
Skewness: 0.001061379559
Sum: 1.827386671 × 10^11
Variance: 9.263164894 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                  Count   Frequency (%)
802906                 1       < 0.1%
48839                  1       < 0.1%
846160                 1       < 0.1%
603042                 1       < 0.1%
278056                 1       < 0.1%
557238                 1       < 0.1%
233008                 1       < 0.1%
751316                 1       < 0.1%
679265                 1       < 0.1%
463952                 1       < 0.1%
Other values (347459)  347459  > 99.9%

Value  Count  Frequency (%)
4      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
12     1      < 0.1%
13     1      < 0.1%
16     1      < 0.1%
17     1      < 0.1%
25     1      < 0.1%
28     1      < 0.1%
31     1      < 0.1%

Value    Count  Frequency (%)
1052934  1      < 0.1%
1052931  1      < 0.1%
1052929  1      < 0.1%
1052926  1      < 0.1%
1052923  1      < 0.1%
1052921  1      < 0.1%
1052915  1      < 0.1%
1052911  1      < 0.1%
1052909  1      < 0.1%
1052908  1      < 0.1%
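
The histogram caption for building_id mentions fixed-size bins (bins=50). A minimal sketch of what equal-width binning between the reported minimum and maximum looks like; the helper names are hypothetical, and treating the last bin as closed on the right is an assumption borrowed from the usual NumPy convention rather than something the report states.

```python
def fixed_bin_edges(lo, hi, bins):
    """Return bins + 1 equally spaced edges spanning [lo, hi]."""
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]

def bin_index(value, lo, hi, bins):
    """Index of the bin holding value; the maximum falls in the last bin."""
    if not lo <= value <= hi:
        raise ValueError("value outside histogram range")
    width = (hi - lo) / bins
    return min(int((value - lo) / width), bins - 1)

# Minimum, maximum and bin count copied from the building_id block above.
lo, hi, bins = 4, 1052934, 50
edges = fixed_bin_edges(lo, hi, bins)
print(f"bin width = {edges[1] - edges[0]:.1f}")

# Bin that holds the reported median (526071).
median_bin = bin_index(526071, lo, hi, bins)
print(median_bin)
```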

geo_level_1_id
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.89731458
Minimum: 0
Maximum: 30
Zeros: 5358
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 7
median: 12
Q3: 21
95-th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.032596704
Coefficient of variation (CV): 0.5779963213
Kurtosis: -1.212221228
Mean: 13.89731458
Median Absolute Deviation (MAD): 6
Skewness: 0.2736617617
Sum: 4828886
Variance: 64.52260981
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Value              Count  Frequency (%)
6                  32485  9.3%
26                 30002  8.6%
10                 29399  8.5%
17                 29265  8.4%
7                  25565  7.4%
8                  25465  7.3%
20                 22761  6.6%
21                 19944  5.7%
4                  19462  5.6%
27                 16786  4.8%
Other values (21)  96335  27.7%

Value  Count  Frequency (%)
0      5358   1.5%
1      3588   1.0%
2      1221   0.4%
3      9995   2.9%
4      19462  5.6%
5      3579   1.0%
6      32485  9.3%
7      25565  7.4%
8      25465  7.3%
9      5213   1.5%

Value  Count  Frequency (%)
30     3595   1.0%
29     537    0.2%
28     349    0.1%
27     16786  4.8%
26     30002  8.6%
25     7489   2.2%
24     1737   0.5%
23     1498   0.4%
22     8358   2.4%
21     19944  5.7%
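
The "Other values" row of the first geo_level_1_id table is a remainder: the observations and distinct values not covered by the ten most frequent values. A short Python check using the counts listed above; the variable names are illustrative.

```python
# Ten most frequent geo_level_1_id values from the table above: value -> count.
top10 = {6: 32485, 26: 30002, 10: 29399, 17: 29265, 7: 25565,
         8: 25465, 20: 22761, 21: 19944, 4: 19462, 27: 16786}
n_observations = 347469
n_distinct = 31

# Whatever the top-10 rows do not account for lands in "Other values".
other_count = n_observations - sum(top10.values())
other_distinct = n_distinct - len(top10)
other_pct = 100 * other_count / n_observations

print(f"Other values ({other_distinct}): {other_count} ({other_pct:.1f}%)")
```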

geo_level_2_id
Real number (ℝ≥0)

Distinct: 1418
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 701.8380517
Minimum: 0
Maximum: 1427
Zeros: 53
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 69
Q1: 350
median: 706
Q3: 1050
95-th percentile: 1377
Maximum: 1427
Range: 1427
Interquartile range (IQR): 700

Descriptive statistics

Standard deviation: 412.8756742
Coefficient of variation (CV): 0.5882776991
Kurtosis: -1.190841647
Mean: 701.8380517
Median Absolute Deviation (MAD): 349
Skewness: 0.02506845849
Sum: 243866966
Variance: 170466.3224
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
39                   5367    1.5%
158                  3317    1.0%
181                  2805    0.8%
1387                 2675    0.8%
157                  2548    0.7%
363                  2343    0.7%
463                  2310    0.7%
673                  2249    0.6%
533                  2217    0.6%
883                  2129    0.6%
Other values (1408)  319509  92.0%

Value  Count  Frequency (%)
0      53     < 0.1%
1      252    0.1%
3      90     < 0.1%
4      415    0.1%
5      31     < 0.1%
6      3      < 0.1%
7      131    < 0.1%
8      164    < 0.1%
9      443    0.1%
10     489    0.1%

Value  Count  Frequency (%)
1427   8      < 0.1%
1426   378    0.1%
1425   611    0.2%
1424   8      < 0.1%
1423   3      < 0.1%
1422   277    0.1%
1421   327    0.1%
1420   12     < 0.1%
1419   131    < 0.1%
1418   214    0.1%

geo_level_3_id
Real number (ℝ≥0)

Distinct: 11861
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6258.84676
Minimum: 0
Maximum: 12567
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 614
Q1: 3073
median: 6271
Q3: 9414
95-th percentile: 11928
Maximum: 12567
Range: 12567
Interquartile range (IQR): 6341

Descriptive statistics

Standard deviation: 3646.950564
Coefficient of variation (CV): 0.582687307
Kurtosis: -1.213916607
Mean: 6258.84676
Median Absolute Deviation (MAD): 3173
Skewness: 0.0005991846632
Sum: 2174755225
Variance: 13300248.41
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
9133                  856     0.2%
633                   841     0.2%
621                   701     0.2%
11246                 633     0.2%
11440                 614     0.2%
2005                  613     0.2%
7723                  594     0.2%
9229                  516     0.1%
2452                  447     0.1%
10445                 402     0.1%
Other values (11851)  341252  98.2%

Value  Count  Frequency (%)
0      4      < 0.1%
1      6      < 0.1%
2      2      < 0.1%
3      13     < 0.1%
4      1      < 0.1%
5      16     < 0.1%
6      34     < 0.1%
7      2      < 0.1%
8      44     < 0.1%
9      4      < 0.1%

Value  Count  Frequency (%)
12567  2      < 0.1%
12566  2      < 0.1%
12565  8      < 0.1%
12564  7      < 0.1%
12563  32     < 0.1%
12562  4      < 0.1%
12561  25     < 0.1%
12560  21     < 0.1%
12559  8      < 0.1%
12558  6      < 0.1%

count_floors_pre_eq
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.130578555
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.72776061
Coefficient of variation (CV): 0.3415788675
Kurtosis: 2.36001261
Mean: 2.130578555
Median Absolute Deviation (MAD): 0
Skewness: 0.8418180575
Sum: 740310
Variance: 0.5296355054
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Value  Count   Frequency (%)
2      209029  60.2%
3      74171   21.3%
1      53705   15.5%
4      7186    2.1%
5      3039    0.9%
6      283     0.1%
7      52      < 0.1%
8      3       < 0.1%
9      1       < 0.1%

Value  Count   Frequency (%)
1      53705   15.5%
2      209029  60.2%
3      74171   21.3%
4      7186    2.1%
5      3039    0.9%
6      283     0.1%
7      52      < 0.1%
8      3       < 0.1%
9      1       < 0.1%

Value  Count   Frequency (%)
9      1       < 0.1%
8      3       < 0.1%
7      52      < 0.1%
6      283     0.1%
5      3039    0.9%
4      7186    2.1%
3      74171   21.3%
2      209029  60.2%
1      53705   15.5%
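
Because count_floors_pre_eq has only nine distinct values, its frequency table is complete, and the summary statistics above can be recomputed from it. The sketch below does this in plain Python; the helper names are hypothetical, and the n - 1 denominator for the variance is an assumption that happens to reproduce the reported figure. It also shows why the IQR and MAD are both 0: more than half of the rows have the value 2.

```python
# Complete frequency table for count_floors_pre_eq, copied from above.
freq = {1: 53705, 2: 209029, 3: 74171, 4: 7186, 5: 3039,
        6: 283, 7: 52, 8: 3, 9: 1}

def value_at(table, position):
    """Value at a 0-based sorted position in the expanded data."""
    seen = 0
    for value in sorted(table):
        seen += table[value]
        if position < seen:
            return value
    raise IndexError("position beyond the number of observations")

def weighted_median(table):
    """Median of the data described by a value -> count table."""
    n = sum(table.values())
    return (value_at(table, (n - 1) // 2) + value_at(table, n // 2)) / 2

n = sum(freq.values())
total = sum(v * c for v, c in freq.items())
mean = total / n
# Sample variance (n - 1 denominator); assumed, but it matches the report.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
median = weighted_median(freq)

# MAD: median of the absolute deviations from the median.
abs_dev = {}
for v, c in freq.items():
    d = abs(v - median)
    abs_dev[d] = abs_dev.get(d, 0) + c
mad = weighted_median(abs_dev)

print(n, total, round(mean, 6), round(variance, 6), median, mad)
```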

age
Real number (ℝ≥0)

ZEROS

Distinct: 42
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.53881353
Minimum: 0
Maximum: 995
Zeros: 34725
Zeros (%): 10.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
median: 15
Q3: 30
95-th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 73.52774868
Coefficient of variation (CV): 2.770574072
Kurtosis: 157.3751623
Mean: 26.53881353
Median Absolute Deviation (MAD): 10
Skewness: 12.19598992
Sum: 9221415
Variance: 5406.329825
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=42)

Value              Count  Frequency (%)
10                 51680  14.9%
15                 48074  13.8%
5                  45045  13.0%
20                 42792  12.3%
0                  34725  10.0%
25                 32586  9.4%
30                 23977  6.9%
35                 14420  4.2%
40                 14050  4.0%
50                 9619   2.8%
Other values (32)  30501  8.8%

Value  Count  Frequency (%)
0      34725  10.0%
5      45045  13.0%
10     51680  14.9%
15     48074  13.8%
20     42792  12.3%
25     32586  9.4%
30     23977  6.9%
35     14420  4.2%
40     14050  4.0%
45     6255   1.8%

Value  Count  Frequency (%)
995    1851   0.5%
200    140    < 0.1%
195    2      < 0.1%
190    5      < 0.1%
185    1      < 0.1%
180    11     < 0.1%
175    5      < 0.1%
170    7      < 0.1%
165    2      < 0.1%
160    8      < 0.1%
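
The age figures above can be cross-checked with a little arithmetic. The sketch below reproduces the reported share of zeros and shows how much the 1851 rows with age 995 pull the mean upward. The variable names are illustrative, and whether 995 is a placeholder code or a genuine age is not stated in the report.

```python
# Figures copied from the age statistics and tables above.
n_observations = 347469
total_age = 9221415                 # "Sum"
zeros = 34725                       # "Zeros"
max_value, max_count = 995, 1851    # first row of the largest-values table

zero_pct = 100 * zeros / n_observations
max_pct = 100 * max_count / n_observations

# Mean over all rows, and mean after setting aside the rows equal to 995.
mean_all = total_age / n_observations
mean_without_max = (total_age - max_value * max_count) / (n_observations - max_count)

print(f"zeros: {zero_pct:.1f}%, age 995: {max_pct:.1f}%")
print(f"mean: {mean_all:.2f}, mean without 995: {mean_without_max:.2f}")
```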

area_percentage
Real number (ℝ≥0)

Distinct: 86
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.017014467
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
median: 7
Q3: 9
95-th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.388646483
Coefficient of variation (CV): 0.5474165602
Kurtosis: 30.64344074
Mean: 8.017014467
Median Absolute Deviation (MAD): 2
Skewness: 3.53162645
Sum: 2785664
Variance: 19.26021795
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
6                  55959  16.1%
7                  49140  14.1%
5                  43556  12.5%
8                  37988  10.9%
9                  29572  8.5%
4                  25675  7.4%
10                 21030  6.1%
11                 18390  5.3%
3                  15687  4.5%
12                 10148  2.9%
Other values (76)  40324  11.6%

Value  Count  Frequency (%)
1      125    < 0.1%
2      4275   1.2%
3      15687  4.5%
4      25675  7.4%
5      43556  12.5%
6      55959  16.1%
7      49140  14.1%
8      37988  10.9%
9      29572  8.5%
10     21030  6.1%

Value  Count  Frequency (%)
100    1      < 0.1%
96     3      < 0.1%
92     3      < 0.1%
90     1      < 0.1%
86     7      < 0.1%
85     5      < 0.1%
84     3      < 0.1%
83     4      < 0.1%
82     1      < 0.1%
81     1      < 0.1%

height_percentage
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 29
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4347985
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 MiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 4
median: 5
Q3: 6
95-th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.915555029
Coefficient of variation (CV): 0.3524610948
Kurtosis: 13.53489828
Mean: 5.4347985
Median Absolute Deviation (MAD): 1
Skewness: 1.762884329
Sum: 1888424
Variance: 3.669351069
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=29)

Value              Count   Frequency (%)
5                  104869  30.2%
6                  61837   17.8%
4                  50427   14.5%
7                  47360   13.6%
3                  34535   9.9%
8                  18460   5.3%
2                  12348   3.6%
9                  7146    2.1%
10                 5934    1.7%
12                 1246    0.4%
Other values (19)  3307    1.0%

Value  Count   Frequency (%)
2      12348   3.6%
3      34535   9.9%
4      50427   14.5%
5      104869  30.2%
6      61837   17.8%
7      47360   13.6%
8      18460   5.3%
9      7146    2.1%
10     5934    1.7%
11     1242    0.4%

Value  Count  Frequency (%)
32     90     < 0.1%
31     2      < 0.1%
29     1      < 0.1%
28     2      < 0.1%
26     3      < 0.1%
25     4      < 0.1%
24     6      < 0.1%
23     12     < 0.1%
22     3      < 0.1%
21     21     < 0.1%
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: t
2nd row: o
3rd row: t
4th row: t
5th row: t

Common Values

Value  Count   Frequency (%)
t      288937  83.2%
n      47413   13.6%
o      11119   3.2%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

foundation_type
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: r
2nd row: r
3rd row: r
4th row: r
5th row: r

Common Values

Value  Count   Frequency (%)
r      292374  84.1%
w      20048   5.8%
u      18908   5.4%
i      14182   4.1%
h      1957    0.6%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block
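
The frequency column of the foundation_type Common Values table is each count divided by the number of observations. The Python sketch below reproduces those percentages and adds a rough imbalance measure, the ratio of the most to the least frequent category. The variable names and the imbalance ratio are illustrative additions, not part of the report.

```python
# Counts copied from the foundation_type Common Values table above.
counts = {"r": 292374, "w": 20048, "u": 18908, "i": 14182, "h": 1957}

n = sum(counts.values())

# Percentage share of each category, formatted like the report (one decimal).
shares = {value: f"{100 * count / n:.1f}%" for value, count in counts.items()}

# Ratio between the most and the least frequent category.
imbalance = max(counts.values()) / min(counts.values())

print(n, shares)
print(f"most/least frequent ratio: {imbalance:.0f}")
```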

roof_type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: n
2nd row: n
3rd row: n
4th row: n
5th row: n

Common Values

Value  Count   Frequency (%)
n      243975  70.2%
q      81905   23.6%
x      21589   6.2%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

ground_floor_type
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: f
2nd row: x
3rd row: f
4th row: f
5th row: f

Common Values

Value  Count   Frequency (%)
f      279591  80.5%
x      33109   9.5%
v      32731   9.4%
z      1334    0.4%
m      704     0.2%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

other_floor_type
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: q
2nd row: q
3rd row: x
4th row: x
5th row: x

Common Values

Value  Count   Frequency (%)
q      220286  63.4%
x      58139   16.7%
j      52912   15.2%
s      16132   4.6%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

position
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: t
2nd row: s
3rd row: t
4th row: s
5th row: s

Common Values

Value  Count   Frequency (%)
s      269463  77.6%
t      57258   16.5%
j      17647   5.1%
o      3101    0.9%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: d
2nd row: d
3rd row: d
4th row: d
5th row: d

Common Values

Value  Count   Frequency (%)
d      333327  95.9%
q      7641    2.2%
u      4909    1.4%
c      450     0.1%
s      449     0.1%
a      353     0.1%
o      195     0.1%
m      64      < 0.1%
n      54      < 0.1%
f      27      < 0.1%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count   Frequency (%)
0      316554  91.1%
1      30915   8.9%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_superstructure_mud_mortar_stone
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count   Frequency (%)
1      264798  76.2%
0      82671   23.8%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      335528  96.6%
1      11941   3.4%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      341104  98.2%
1      6365    1.8%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_superstructure_mud_mortar_brick
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      323848  93.2%
1      23621   6.8%

Length

Histogram of lengths of the category (figure not reproduced)

Pie chart

(figure not reproduced; it plots the counts listed under Common Values)

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_superstructure_cement_mortar_brick
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 321440
1: 26029

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      321440  92.5%
1      26029   7.5%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_superstructure_timber
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 258995
1: 88474

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

Value  Count   Frequency (%)
0      258995  74.5%
1      88474   25.5%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block
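
As a hedged cross-check, the Frequency (%) figures for has_superstructure_timber can be recomputed from the two counts reported for it. The sketch below is not part of the original report; the names `counts` and `frequency_pct` are illustrative, and rounding to one decimal is assumed to match the report's display.

```python
# Counts reported for has_superstructure_timber in this report.
counts = {"0": 258995, "1": 88474}

# The counts should add up to the number of observations in the dataset.
total = sum(counts.values())

# Share of each value in percent, rounded to one decimal place.
frequency_pct = {value: round(100 * n / total, 1) for value, n in counts.items()}

print(total)
print(frequency_pct)
```

The total agrees with the number of observations in the overview (347469), which is consistent with the variable having no missing cells.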

has_superstructure_bamboo
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 318046
1: 29423

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

Value  Count   Frequency (%)
0      318046  91.5%
1      29423   8.5%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_superstructure_rc_non_engineered
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 332678
1: 14791

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      332678  95.7%
1      14791   4.3%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_superstructure_rc_engineered
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 341964
1: 5505

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      341964  98.4%
1      5505    1.6%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 342243
1: 5226

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      342243  98.5%
1      5226    1.5%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
v: 334633
a: 7307
w: 3539
r: 1990

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowv
2nd rowv
3rd rowv
4th rowv
5th rowv

Common Values

Value  Count   Frequency (%)
v      334633  96.3%
a      7307    2.1%
w      3539    1.0%
r      1990    0.6%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

count_families
Real number (ℝ≥0)

ZEROS

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.9837395566
Minimum0
Maximum9
Zeros27937
Zeros (%)8.0%
Negative0
Negative (%)0.0%
Memory size2.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median1
Q31
95-th percentile2
Maximum9
Range9
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.4193854935
Coefficient of variation (CV)0.4263176069
Kurtosis17.24872251
Mean0.9837395566
Median Absolute Deviation (MAD)0
Skewness1.627559333
Sum341819
Variance0.1758841922
MonotonicityNot monotonic
[Figure: Histogram with fixed size bins (bins=10)]

Values by descending count

Value  Count   Frequency (%)
1      301377  86.7%
0      27937   8.0%
2      15010   4.3%
3      2415    0.7%
4      547     0.2%
5      135     < 0.1%
6      33      < 0.1%
7      8       < 0.1%
9      4       < 0.1%
8      3       < 0.1%

Values in ascending order

Value  Count   Frequency (%)
0      27937   8.0%
1      301377  86.7%
2      15010   4.3%
3      2415    0.7%
4      547     0.2%
5      135     < 0.1%
6      33      < 0.1%
7      8       < 0.1%
8      3       < 0.1%
9      4       < 0.1%

Values in descending order

Value  Count   Frequency (%)
9      4       < 0.1%
8      3       < 0.1%
7      8       < 0.1%
6      33      < 0.1%
5      135     < 0.1%
4      547     0.2%
3      2415    0.7%
2      15010   4.3%
1      301377  86.7%
0      27937   8.0%
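
The summary statistics for count_families can be cross-checked against its frequency table. The following is a minimal sketch rather than the report's own code: it rebuilds the sum, mean, variance, standard deviation and coefficient of variation from the value counts alone. The use of the sample variance (divisor n - 1) is an assumption about the report's convention, and all variable names are illustrative.

```python
import math

# Value -> count, copied from the count_families frequency table.
counts = {0: 27937, 1: 301377, 2: 15010, 3: 2415, 4: 547,
          5: 135, 6: 33, 7: 8, 8: 3, 9: 4}

n = sum(counts.values())                           # number of observations
total = sum(v * c for v, c in counts.items())      # "Sum" in the report
mean = total / n

# Sample variance (divisor n - 1); assumed to match the report's convention.
squared_dev = sum(c * (v - mean) ** 2 for v, c in counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n, total)
print(round(mean, 6), round(variance, 6), round(std, 6), round(cv, 6))
```

With 347469 observations, the choice between the n and n - 1 divisors changes the variance only around the seventh decimal place, so either convention agrees with the reported figure to the precision most readers need.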

has_secondary_use
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 308630
1: 38839

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      308630  88.8%
1      38839   11.2%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_secondary_use_agriculture
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 325124
1: 22345

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      325124  93.6%
1      22345   6.4%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

has_secondary_use_hotel
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 335764
1: 11705

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      335764  96.6%
1      11705   3.4%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 344642
1: 2827

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      344642  99.2%
1      2827    0.8%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 347136
1: 333

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      347136  99.9%
1      333     0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 347343
1: 126

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      347343  > 99.9%
1      126     < 0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 347103
1: 366

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      347103  99.9%
1      366     0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 347411
1: 58

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      347411  > 99.9%
1      58      < 0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 347421
1: 48

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      347421  > 99.9%
1      48      < 0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 347442
1: 27

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      347442  > 99.9%
1      27      < 0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.7 MiB
0: 345709
1: 1760

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count   Frequency (%)
0      345709  99.5%
1      1760    0.5%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the value counts above]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions

[Figure: interaction plots, 100 panels]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
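
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the helper names (`average_ranks`, `pearson`, `spearman`) and the toy data are assumptions made for the example, and tied values are assumed to share the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (dev_x * dev_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
```

On this toy input `rho` reaches the maximum while `r` stays below it, which is the sense in which ρ captures nonlinear monotonic relationships.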

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the direction of the slope determines its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
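
As a minimal sketch of this formula and of the invariance property mentioned above, the following plain-Python example uses population (1/n) covariance and standard deviations. The function name `pearson_r` and the toy data are assumptions made for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: cov(X, Y) / (std(X) * std(Y)), population form."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data: roughly linear with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)

# Shifting and rescaling each variable separately leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# A steep and a shallow exact line both give the same r.
r_steep = pearson_r(x, [100 * a for a in x])
r_shallow = pearson_r(x, [0.01 * a for a in x])
```

Note that a negative scale factor on one variable would flip the sign of r while leaving its magnitude unchanged.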

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
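
The pair-counting rule above can be written directly as a short sketch. This is the simple form described in the text (often called tau-a); statistical libraries commonly use tie-adjusted variants instead, so values may differ on data with ties. The function name `kendall_tau_a` and the toy data are assumptions made for the example.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 is a tie and counts as neither.
    return (concordant - discordant) / len(pairs)


# Toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)
```

The quadratic loop over all pairs is fine for illustration but would be slow on a dataset of this report's size.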

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
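
A minimal sketch of a bias-corrected Cramér's V follows, assuming the commonly cited form of Bergsma's correction: the chi-squared-based φ² is reduced by (k-1)(r-1)/(n-1) and the category counts r and k are shrunk before normalising. The function name `cramers_v_corrected` and the toy category codes are assumptions for illustration; this is not taken from the report's own implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_value, row_count in row_totals.items():
        for col_value, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_value, col_value), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Assumed correction terms (Bergsma-style).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # A single-category variable carries no association.
    return sqrt(phi2_corr / denom)


# Toy data: each code in `a` maps to exactly one code in `b`.
a = ["t", "r", "w"] * 20
b = ["n", "q", "x"] * 20
v_perfect = cramers_v_corrected(a, b)

# Toy data: all four combinations occur equally often.
c = ["x", "y"] * 30
d = ["p", "p", "q", "q"] * 15
v_independent = cramers_v_corrected(c, d)
```

With these inputs the perfectly associated pair evaluates to 1 (up to floating-point error) and the balanced pair to 0, matching the range described above.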

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
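
The two displays above boil down to a per-cell present/missing flag and its column sums. The sketch below shows that idea in plain Python on hypothetical toy rows with gaps; the overview reports zero missing cells for this dataset, so here every flag would be true. The function names and the toy records are assumptions made for illustration, not the plotting code behind the report.

```python
def nullity_matrix(rows, columns):
    """One list of booleans per row: True where the cell holds a value."""
    return [[row.get(col) is not None for col in columns] for row in rows]


def present_counts(rows, columns):
    """Number of non-missing cells per column (the bar-chart view)."""
    matrix = nullity_matrix(rows, columns)
    return {col: sum(flags[i] for flags in matrix) for i, col in enumerate(columns)}


def render(matrix):
    """Text stand-in for the matrix plot: '#' is present, '.' is missing."""
    return ["".join("#" if flag else "." for flag in flags) for flags in matrix]


# Hypothetical records with gaps, using three column names from the dataset.
columns = ["age", "roof_type", "count_families"]
rows = [
    {"age": 30, "roof_type": "n", "count_families": 1},
    {"age": None, "roof_type": "q", "count_families": 1},
    {"age": 10, "roof_type": None, "count_families": None},
]

matrix = nullity_matrix(rows, columns)
counts = present_counts(rows, columns)
picture = render(matrix)
```

Reading `picture` row by row gives the same kind of pattern the matrix plot shows: gaps that line up across columns stand out immediately.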

Sample

First rows

df_index | building_id | geo_level_1_id | geo_level_2_id | geo_level_3_id | count_floors_pre_eq | age | area_percentage | height_percentage | land_surface_condition | foundation_type | roof_type | ground_floor_type | other_floor_type | position | plan_configuration | has_superstructure_adobe_mud | has_superstructure_mud_mortar_stone | has_superstructure_stone_flag | has_superstructure_cement_mortar_stone | has_superstructure_mud_mortar_brick | has_superstructure_cement_mortar_brick | has_superstructure_timber | has_superstructure_bamboo | has_superstructure_rc_non_engineered | has_superstructure_rc_engineered | has_superstructure_other | legal_ownership_status | count_families | has_secondary_use | has_secondary_use_agriculture | has_secondary_use_hotel | has_secondary_use_rental | has_secondary_use_institution | has_secondary_use_school | has_secondary_use_industry | has_secondary_use_health_post | has_secondary_use_gov_office | has_secondary_use_use_police | has_secondary_use_other
0080290664871219823065trnfqtd11000000000v100000000000
11288308900281221087ornxqsd01000000000v100000000000
229494721363897321055trnfxtd01000000000v100000000000
33590882224181069421065trnfxsd01000011000v100000000000
4420194411131148833089trnfxsd10000000000v100000000000
553330208558608921095trnfqsd01000000000v111000000000
6672845194751206622534nrnxqsd01000000000v100000000000
7747551520323122362086twqvxsu00000110000v100000000000
884411260757721921586trqfqsd01000010000v100000000000
999895002688699410134tinvjsd00000100000v100000000000

Last rows

df_index | building_id | geo_level_1_id | geo_level_2_id | geo_level_3_id | count_floors_pre_eq | age | area_percentage | height_percentage | land_surface_condition | foundation_type | roof_type | ground_floor_type | other_floor_type | position | plan_configuration | has_superstructure_adobe_mud | has_superstructure_mud_mortar_stone | has_superstructure_stone_flag | has_superstructure_cement_mortar_stone | has_superstructure_mud_mortar_brick | has_superstructure_cement_mortar_brick | has_superstructure_timber | has_superstructure_bamboo | has_superstructure_rc_non_engineered | has_superstructure_rc_engineered | has_superstructure_other | legal_ownership_status | count_families | has_secondary_use | has_secondary_use_agriculture | has_secondary_use_hotel | has_secondary_use_rental | has_secondary_use_institution | has_secondary_use_school | has_secondary_use_industry | has_secondary_use_health_post | has_secondary_use_gov_office | has_secondary_use_use_police | has_secondary_use_other
3474598685829084217114958073555trnfqsd01000011000v100000000000
34746086859330371455199622595trqfqsd01000000000v100000000000
3474618686069861220173518321095twqfxsu00000010000v100000000000
347462868614451926460925825146trxxssd00000000100v100000000000
347463868626401157116640625165tixvssd00000000010v210100000000
3474648686331002846053623370206trqfqtd01000010000w111000000000
347465868646635671014071190732567nrnfqsd11100000000v100000000000
347466868651049160221136771215033trnfjsd01000010000v100000000000
34746786866442785610419122595trnfqsd11000000000a100000000000
3474688686750137226366436210114trqvqsd00000100000v100000000000